Abstract

A time-inconsistent optimal control problem is formulated and studied for a controlled linear
ordinary differential equation with quadratic cost functional. A notion of equilibrium control is
introduced, which can be regarded as a time-consistent solution to the original time-inconsistent
problem. Under certain conditions, we constructively prove the existence of such an equilibrium
control, which is represented via a forward ordinary differential equation coupled with a backward
Riccati-Volterra integral equation. Our constructive approach is based on the introduction of
a family of $N$-person non-cooperative differential games.

Keywords. Time-inconsistency, linear-quadratic optimal control problem, equilibrium control,
multi-level hierarchical differential games, backward Riccati-Volterra integral equation.

AMS Mathematics subject classification. 49L20, 49N10, 49N70, 91A23.

1 Introduction: Time-Consistency Issue.

We begin with a classical optimal control problem for an ordinary differential equation (ODE, for
short). Let $T > 0$. For any initial pair $(t,x) \in [0,T) \times \mathbb{R}^n$, consider the following controlled ODE:

$$\begin{cases} \dot X(s) = b(s, X(s), u(s)), & s \in [t,T], \\ X(t) = x, \end{cases} \tag{1.1}$$

where $b : [0,T] \times \mathbb{R}^n \times U \to \mathbb{R}^n$ is a given map, $u(\cdot)$, a function valued in some metric space $U$, is
called a control, and $X(\cdot)$ is called the state trajectory. We denote

$$\mathcal{U}[t,s] = \big\{ u : [t,s] \to U \bigm| u(\cdot) \text{ is measurable} \big\}, \qquad \forall\, 0 \le t < s \le T. \tag{1.2}$$

Under some mild conditions, for any initial pair $(t,x) \in [0,T) \times \mathbb{R}^n$ and $u(\cdot) \in \mathcal{U}[t,T]$, (1.1) admits
a unique solution $X(\cdot) = X(\cdot\,; t,x,u(\cdot))$. Then we can introduce the following cost functional, which
measures the performance of the control $u(\cdot)$:

$$J(t,x;u(\cdot)) = \int_t^T g\big(s, X(s;t,x,u(\cdot)), u(s)\big)\,ds + h\big(X(T;t,x,u(\cdot))\big), \tag{1.3}$$

for some given maps $g : [0,T] \times \mathbb{R}^n \times U \to \mathbb{R}$ and $h : \mathbb{R}^n \to \mathbb{R}$. The terms on the right-hand side
of (1.3) are referred to as the running cost and the terminal cost, respectively. The following is a
classical optimal control problem.

Problem (D). For any given initial pair $(t,x) \in [0,T) \times \mathbb{R}^n$, find a $\bar u(\cdot) \in \mathcal{U}[t,T]$ such that

$$J(t,x;\bar u(\cdot)) = \inf_{u(\cdot) \in \mathcal{U}[t,T]} J(t,x;u(\cdot)). \tag{1.4}$$

Any $\bar u(\cdot) \in \mathcal{U}[t,T]$ satisfying the above is called an optimal control for $(t,x)$, $\bar X(\cdot) = X(\cdot\,;t,x,\bar u(\cdot))$
is called the corresponding optimal trajectory, and $(\bar X(\cdot), \bar u(\cdot))$ is referred to as an optimal pair.

Note that sometimes we might encounter the following, seemingly slightly more general, form of
the cost functional:

$$J(t,x;u(\cdot)) = \int_t^T e^{-\int_t^s c(r,X(r),u(r))\,dr}\, g(s,X(s),u(s))\,ds + e^{-\int_t^T c(r,X(r),u(r))\,dr}\, h(X(T)), \tag{1.5}$$

with $c(\cdot)$ being some map taking nonnegative values, which may be called a discount map. A special
case is $c(\cdot) \equiv \delta > 0$, a positive constant (which is called a discount rate). Due to its form, the term
$e^{-\int_t^s c(r,X(r),u(r))\,dr}$ is called an exponential discounting. If we introduce

$$\begin{cases} \dot X^0(s) = c(s, X(s), u(s)), & s \in [t,T], \\ X^0(t) = 0, \end{cases} \tag{1.6}$$

and regard $X^0(\cdot)$ as an additional component of the state, then the state equation is augmented by
one dimension and the cost functional becomes

$$J(t,x;u(\cdot)) = \int_t^T e^{-X^0(s)} g(s, X(s), u(s))\,ds + e^{-X^0(T)} h(X(T)), \tag{1.7}$$

which is of form (1.3). Therefore, an optimal control problem with an exponential discounting can
be transformed to an optimal control problem without exponential discounting. In other words,
containing an exponential discounting in the cost functional does not make the original problem
mathematically more general.

The dynamic programming method is a powerful classical approach to Problem (D). This method
suggests that we define the value function of Problem (D) by the following:

$$\begin{cases} V(t,x) = \displaystyle\inf_{u(\cdot) \in \mathcal{U}[t,T]} J(t,x;u(\cdot)), & (t,x) \in [0,T) \times \mathbb{R}^n, \\ V(T,x) = h(x), & \forall x \in \mathbb{R}^n. \end{cases} \tag{1.8}$$

It is well-known that the following Bellman's principle of optimality holds ([25]):

$$V(t,x) = \inf_{u(\cdot) \in \mathcal{U}[t,\tau]} \Big\{ \int_t^\tau g(s, X(s), u(s))\,ds + V(\tau, X(\tau)) \Big\}, \qquad \forall \tau \in [t,T]. \tag{1.9}$$

Now, if $\bar u(\cdot) \in \mathcal{U}[t,T]$ is an optimal control for the initial pair $(t,x)$, then, from the above, for any
$\tau \in (t,T)$,

$$\begin{aligned} V(t,x) = J(t,x;\bar u(\cdot)) &= \int_t^T g(s, \bar X(s), \bar u(s))\,ds + h(\bar X(T)) \\ &= \int_t^\tau g(s, \bar X(s), \bar u(s))\,ds + J\big(\tau, \bar X(\tau); \bar u|_{[\tau,T]}(\cdot)\big) \\ &\ge \int_t^\tau g(s, \bar X(s), \bar u(s))\,ds + V(\tau, \bar X(\tau)) \ge V(t,x). \end{aligned} \tag{1.10}$$

Hence, all the equalities in the above have to hold. Consequently,

$$\inf_{u(\cdot) \in \mathcal{U}[\tau,T]} J(\tau, \bar X(\tau); u(\cdot)) = V(\tau, \bar X(\tau)) = J\big(\tau, \bar X(\tau); \bar u|_{[\tau,T]}(\cdot)\big). \tag{1.11}$$

This means that for any $0 \le t < \tau < T$, the restriction $\bar u|_{[\tau,T]}(\cdot) \in \mathcal{U}[\tau,T]$ of the optimal control
$\bar u(\cdot) \in \mathcal{U}[t,T]$ for the initial pair $(t,x)$ to the time interval $[\tau,T]$ is optimal for the initial pair
$(\tau, \bar X(\tau))$. Such a phenomenon is referred to as the time-consistency of Problem (D). The advantage
of time-consistency is that one needs only to solve Problem (D) for a given initial pair $(t,x) \in
[0,T) \times \mathbb{R}^n$, and as time goes by, the restriction of the optimal control $\bar u(\cdot)$ for $(t,x)$ to any later time
interval $[\tau,T]$ will automatically be an optimal control for the corresponding initial pair $(\tau, \bar X(\tau))$.

However, common sense tells us that the time-consistency issue in real life is actually never so 
simple. There are two main reasons: First, as time goes by, the environment (in the broad sense) 
is changing, for example, invention of new technology, new limits of resource allocation, etc., and 
therefore the controlled system has to be modified according to the new initial pairs; and secondly, 
people keep changing minds/objectives, which leads to the change of cost functional. Due to these 
changes, one expects some dramatic changes in the formulation of optimal control problems, as 
well as the solutions to the problems. 

To make our statement more appealing from a mathematical point of view, let us look at a very
simple illustrative example. Consider a one-dimensional controlled ODE:

$$\begin{cases} \dot X(s) = u(s), & s \in [t,T], \\ X(t) = x, \end{cases} \tag{1.12}$$

with cost functional

$$J(t,x;u(\cdot)) = \int_t^T u(s)^2\,ds + h(t)\,X(T;t,x,u(\cdot))^2, \tag{1.13}$$

where $h : [0,T] \to [\delta,\infty)$, for some $\delta > 0$, and $U = \mathbb{R}$. We pose the following optimal control
problem.

Problem (C). For given $(t,x) \in [0,T) \times \mathbb{R}$, find a $\bar u(\cdot) \in \mathcal{U}[t,T]$ such that

$$J(t,x;\bar u(\cdot)) = \inf_{u(\cdot) \in \mathcal{U}[t,T]} J(t,x;u(\cdot)). \tag{1.14}$$

Note that the above problem looks like a simple standard linear quadratic optimal control
problem (LQ problem, for short), except that the terminal weight $h(t)$ depends on the parameter
$t$ (which is the initial time of the problem).

It is clear that for any initial pair $(t,x) \in [0,T) \times \mathbb{R}$, $u(\cdot) \mapsto J(t,x;u(\cdot))$ is convex and coercive.
Thus, there exists a unique optimal control for Problem (C). We can show that (see the Appendix)
the optimal control of Problem (C) is given by

$$\bar u(s) = \bar u(s;t,x) = -\frac{h(t)\,x}{1 + h(t)(T-t)}, \qquad s \in [t,T], \tag{1.15}$$

and the corresponding optimal trajectory is given by

$$\bar X(s;t,x,\bar u(\cdot)) = x\,\frac{1 + h(t)(T-s)}{1 + h(t)(T-t)}, \qquad s \in [t,T]. \tag{1.16}$$

Now, for $\tau \in (t,T)$, we consider Problem (C) on $[\tau,T]$ with initial state

$$y = \bar X(\tau;t,x,\bar u(\cdot)) = x\,\frac{1 + h(t)(T-\tau)}{1 + h(t)(T-t)}. \tag{1.17}$$

We can show that

$$J\big(\tau,y;\bar u|_{[\tau,T]}(\cdot)\big) - \inf_{u(\cdot) \in \mathcal{U}[\tau,T]} J(\tau,y;u(\cdot)) = \frac{x^2 [h(\tau) - h(t)]^2 (T-\tau)}{[1 + h(\tau)(T-\tau)][1 + h(t)(T-t)]^2} > 0, \tag{1.18}$$

unless $h(\tau) = h(t)$ or $x = 0$. This tells us that the restriction of $\bar u(\cdot\,;t,x)$ to $[\tau,T]$ is not optimal for Problem (C) with initial
pair $(\tau, \bar X(\tau;t,x))$, in general. Such a phenomenon is called time-inconsistency.
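
The gap between the pre-committed control and the re-optimized one can be checked numerically. The following is a minimal sketch, assuming the closed-form control (1.15) and the scalar dynamics (1.12); the function names and the sample weight $h(s) = 1 + s$ are illustrative choices, not part of the original text.

```python
def precommitted_control(h, t, x, T):
    """Constant optimal control of Problem (C) started at (t, x), as in (1.15)."""
    return -h(t) * x / (1.0 + h(t) * (T - t))


def cost_from(h, tau, y, u_const, T):
    """Cost evaluated at time tau with terminal weight h(tau), for a constant
    control u_const applied on [tau, T] from state y (dynamics dX/ds = u)."""
    x_T = y + u_const * (T - tau)
    return u_const ** 2 * (T - tau) + h(tau) * x_T ** 2


def optimal_cost_from(h, tau, y, T):
    """Minimal cost of Problem (C) restarted at (tau, y)."""
    return h(tau) * y ** 2 / (1.0 + h(tau) * (T - tau))


def inconsistency_gap(h, t, tau, x, T):
    """Cost of keeping the control planned at t, minus the re-optimized cost at tau."""
    u_bar = precommitted_control(h, t, x, T)
    y = x + u_bar * (tau - t)  # state reached at tau under the planned control
    return cost_from(h, tau, y, u_bar, T) - optimal_cost_from(h, tau, y, T)


def gap_formula(h, t, tau, x, T):
    """Closed-form gap x^2 (h(tau)-h(t))^2 (T-tau) / ([1+h(tau)(T-tau)][1+h(t)(T-t)]^2)."""
    num = x ** 2 * (h(tau) - h(t)) ** 2 * (T - tau)
    den = (1.0 + h(tau) * (T - tau)) * (1.0 + h(t) * (T - t)) ** 2
    return num / den


if __name__ == "__main__":
    h = lambda s: 1.0 + s  # illustrative, strictly positive weight
    print(inconsistency_gap(h, 0.0, 0.5, 2.0, 1.0))
    print(gap_formula(h, 0.0, 0.5, 2.0, 1.0))
```

With a constant weight $h$ the gap vanishes, which is the time-consistent case; any non-constant $h$ and $x \ne 0$ produce a strictly positive gap.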

Qualitative analysis of time-inconsistent behaviors can be traced back at least to the works by
Hume [10] in 1739 and by Smith [22] in 1759. Later relevant works were made by Malthus [14]
in 1828, Jevons [11] in 1871, Marshall [16] in 1890, Böhm-Bawerk [4] in 1891, Pareto [19] in
1909, and so on. A mathematical formulation of time-inconsistency was first presented by Strotz
[23] in 1955, followed by Pollak [21], Peleg-Yaari [20], Goldman [7], Laibson [13], etc. See Palacios-Huerta
[18] for an interesting survey on the history. The above-mentioned mathematical works,
starting from Strotz, mainly studied problems for either discrete dynamic systems or simple ODEs,
involving non-exponential discounting, meaning that in the cost functional (see (1.5) with $c(\cdot) \equiv \delta$),
the classical exponential discounting $e^{-\delta(s-t)}$ is replaced by a function $h(s-t)$. Recently, Ekeland-Lazrak
[5] and Ekeland-Pirvu [6] continued the study of non-exponential discounting problems
both for simple ODEs and SDEs. At the same time, Basak-Chabakauri [1] and Björk-Murgoci [3]
started to discuss problems with the cost/payoff functional depending on the initial pair $(t,x)$.
We refer to [8], [9], [12], [17], [24] for some relevant results.

In general, for any given initial pair $(t,x) \in [0,T) \times \mathbb{R}^n$, we can consider the following controlled
system:

$$\begin{cases} \dot X(s) = b(t, x, s, X(s), u(s)), & s \in [t,T], \\ X(t) = x, \end{cases} \tag{1.19}$$

with the cost functional:

$$J(t,x;u(\cdot)) = \int_t^T g(t, x, s, X(s), u(s))\,ds + h(t, x, X(T)). \tag{1.20}$$

We point out that state equation (1.19) and cost functional (1.20) are significantly different from
(1.1) and (1.3), respectively, due to the way they depend on the initial pair $(t,x) \in [0,T) \times \mathbb{R}^n$.
Such a dependence allows us to capture situations in which people modify the control system
and/or the cost functional at different initial pairs. Clearly, our setting is much more general than
[5]. Naturally, one could pose the following optimal control problem.

Problem (N). For $(t,x) \in [0,T) \times \mathbb{R}^n$, find $\bar u(\cdot) \in \mathcal{U}[t,T]$ such that

$$J(t,x;\bar u(\cdot)) = \inf_{u(\cdot) \in \mathcal{U}[t,T]} J(t,x;u(\cdot)). \tag{1.21}$$

It is clear that Problem (C) is a special case of Problem (N). Hence, Problem (N) is time-inconsistent,
in general. Any optimal control $\bar u(\cdot) \in \mathcal{U}[t,T]$ of Problem (N) is referred to as a
pre-committed optimal control on $[t,T]$. Due to the time-inconsistency, finding an optimal control
$\bar u(\cdot) \in \mathcal{U}[t,T]$ for Problem (N) (assuming it exists) might not be very useful (if it is not useless) in
the long run. Hence, Problem (N) is natural, but is a little too naive.

In this paper, we will concentrate on a linear-quadratic time-inconsistent control problem. We
will present a time-consistent solution via a "sophisticated" approach. The main idea comes from
the works [23], [21], [20], and [7]. Here is a brief description. Take a partition $\Delta : 0 = t_0 < t_1 <
\cdots < t_N = T$ of the time interval $[0,T]$. Consider an $N$-person non-cooperative differential game:
the $k$-th player (which may be called self-$k$) starts the game from the initial pair $(t_{k-1}, X(t_{k-1}))$
and controls the system on $[t_{k-1}, t_k]$ to minimize his own cost functional. At $t = t_k$, the next player
(the $(k+1)$-th player, or self-$(k+1)$) takes over, starting from the initial pair $(t_k, X_k(t_k))$, which
is the terminal pair of the $k$-th player, and controlling the system on $[t_k, t_{k+1}]$, etc. Each player
knows that the later players will do their best, and will modify their control systems as well as their
cost functionals. However, in measuring the performance of the controls, each player will discount
the cost/payoff in his/her own way. This is the main issue in handling the time-inconsistency, and
it also has to be treated this way so that the results can recover those for exponential discounting
situations. It is expected that as the mesh size $\|\Delta\| = \max\{t_k - t_{k-1} \mid 1 \le k \le N\} \to 0$,
the Nash equilibrium strategy of the $N$-person differential game should approach the desired
time-consistent solution of the original time-inconsistent Problem (N).

The rest of the paper is organized as follows. In Section 2, we collect some preliminary results,
mainly some careful estimates relevant to our time-inconsistent optimal control problem. Section 3
is devoted to a study of the $N$-person differential game. In Section 4, we will discuss the convergence of
the Nash equilibrium value function for the $N$-person differential game, as well as a sufficient condition
for the existence of a time-consistent equilibrium control for Problem (N). Finally, a time-inconsistent
LQ problem will be presented.

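2 $N$-Person Differential Games

Consider the following linear controlled ODE parameterized by $(t,x) \in [0,T] \times \mathbb{R}^n$:

$$\begin{cases} \dot X(s) = A(t,x,s)X(s) + B(t,x,s)u(s), & s \in [t,T], \\ X(t) = x, \end{cases} \tag{2.1}$$

with the cost functional

$$J(t,x;u(\cdot)) = \langle G(t,x)X(T), X(T) \rangle + \int_t^T \big[ \langle Q(t,x,s)X(s), X(s) \rangle + \langle R(t,x,s)u(s), u(s) \rangle \big]\,ds. \tag{2.2}$$

Here $A$, $B$, $Q$, $R$ and $G$ are some given suitable maps. Let $\Delta$ be a partition of $[0,T]$ given by

$$\Delta : 0 = t_0 < t_1 < \cdots < t_N = T.$$

We now introduce an $N$-person differential game associated with $\Delta$. These $N$ players are labeled
by $k = 1, 2, \cdots, N$. The $k$-th player chooses controls from $\mathcal{U}[t_{k-1}, t_k]$. Any $(u_1(\cdot), \cdots, u_N(\cdot)) \in
\mathcal{U}[t_0,t_1] \times \cdots \times \mathcal{U}[t_{N-1},t_N]$ is identified with $u^\Delta(\cdot) \in \mathcal{U}[0,T]$, where

$$u^\Delta(s) = u_k(s), \qquad s \in [t_{k-1}, t_k), \quad 1 \le k \le N. \tag{2.3}$$

Now, for any $(x, u^\Delta(\cdot)) \in \mathbb{R}^n \times \mathcal{U}[0,T]$, let $X^\Delta(\cdot)$ be the solution to the following:

$$\begin{cases} \dot X^\Delta(s) = A(t_{k-1}, X^\Delta(t_{k-1}), s)X^\Delta(s) + B(t_{k-1}, X^\Delta(t_{k-1}), s)u^\Delta(s), \\ \qquad\qquad s \in [t_{k-1}, t_k), \quad 1 \le k \le N, \\ X^\Delta(0) = x. \end{cases} \tag{2.4}$$

The $k$-th player has the following cost functional:

$$\begin{aligned} J_k(u^\Delta(\cdot)) &= J_k(u_1(\cdot), \cdots, u_N(\cdot)) = J(t_{k-1}, X^\Delta(t_{k-1}); u^\Delta(\cdot)) \\ &\equiv \langle G(t_{k-1}, X^\Delta(t_{k-1}))X^\Delta(T), X^\Delta(T) \rangle \\ &\quad + \int_{t_{k-1}}^T \big[ \langle Q(t_{k-1}, X^\Delta(t_{k-1}), s)X^\Delta(s), X^\Delta(s) \rangle + \langle R(t_{k-1}, X^\Delta(t_{k-1}), s)u^\Delta(s), u^\Delta(s) \rangle \big]\,ds. \end{aligned} \tag{2.5}$$

For any $x \in \mathbb{R}^n$ and any partition $\Delta$ of $[0,T]$, we now pose the following problem.

Problem (LQ$^\Delta$). Find a control $\bar u^\Delta(\cdot) = (\bar u_1(\cdot), \cdots, \bar u_N(\cdot)) \in \mathcal{U}[0,T]$ such that for each
$k = 1, 2, \cdots, N$,

$$J_k(\bar u^\Delta(\cdot)) = \inf_{u_k(\cdot) \in \mathcal{U}[t_{k-1},t_k]} J_k(\bar u_1(\cdot), \cdots, \bar u_{k-1}(\cdot), u_k(\cdot), \bar u_{k+1}(\cdot), \cdots, \bar u_N(\cdot)). \tag{2.6}$$

Any control $\bar u^\Delta(\cdot)$ satisfying the above is called an equilibrium control of Problem (LQ$^\Delta$). The
corresponding state trajectory $\bar X^\Delta(\cdot)$ and the pair $(\bar X^\Delta(\cdot), \bar u^\Delta(\cdot))$ are called an equilibrium state
trajectory and an equilibrium pair of Problem (LQ$^\Delta$), respectively.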

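We now introduce the following assumptions.

(H1) The maps $A : [0,T] \times [0,T] \to \mathbb{R}^{n \times n}$, $B : [0,T] \times [0,T] \to \mathbb{R}^{n \times m}$, $Q : [0,T] \times [0,T] \to \mathcal{S}^n$,
$R : [0,T] \times [0,T] \to \mathcal{S}^m$, and $G : [0,T] \to \mathcal{S}^n$ are continuous. There exist constants $L, \delta > 0$ such
that

$$\begin{aligned} &\|A(t,s) - A(\tau,s)\| + \|B(t,s) - B(\tau,s)\| + \|Q(t,s) - Q(\tau,s)\| \\ &\qquad + \|R(t,s) - R(\tau,s)\| + \|G(t) - G(\tau)\| \le L|t - \tau|, \qquad s, t, \tau \in [0,T], \end{aligned}$$

and

$$Q(t,s),\ G(t) \ge 0, \qquad R(t,s) \ge \delta I, \qquad \forall t, s \in [0,T].$$

(H2) The maps $G(\cdot)$, $Q(\cdot\,,\cdot)$, and $R(\cdot\,,\cdot)$ satisfy the following:

$$G(t) \le G(\tau), \quad Q(t,s) \le Q(\tau,s), \quad R(t,s) \le R(\tau,s), \qquad \forall\, 0 \le t \le \tau \le s \le T.$$

For any partition $\Delta$ of $[0,T]$, we denote

$$A^\Delta(s) = \sum_{k=1}^N A(t_{k-1},s)\mathbf{1}_{[t_{k-1},t_k)}(s), \qquad B^\Delta(s) = \sum_{k=1}^N B(t_{k-1},s)\mathbf{1}_{[t_{k-1},t_k)}(s), \tag{2.8}$$

$$Q^\Delta(s) = \sum_{k=1}^N Q(t_{k-1},s)\mathbf{1}_{[t_{k-1},t_k)}(s), \qquad R^\Delta(s) = \sum_{k=1}^N R(t_{k-1},s)\mathbf{1}_{[t_{k-1},t_k)}(s). \tag{2.9}$$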

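Our first result is the following.

Theorem 2.1. Let (H1) hold. For any partition $\Delta$ of $[0,T]$ and any $x \in \mathbb{R}^n$, Problem (LQ$^\Delta$)
admits a unique equilibrium pair $(\bar X^\Delta(\cdot), \bar u^\Delta(\cdot))$. Moreover, $\bar X^\Delta(\cdot)$ and $\bar u^\Delta(\cdot)$ are linked by the
following:

$$\bar u^\Delta(s) = -R^\Delta(s)^{-1} B^\Delta(s)^T P^\Delta(s) \bar X^\Delta(s), \qquad s \in [0,T], \tag{2.10}$$

where $P^\Delta(\cdot)$ is the unique solution to the following Riccati equation:

$$\begin{cases} \dot P^\Delta(s) + P^\Delta(s)A^\Delta(s) + A^\Delta(s)^T P^\Delta(s) + Q^\Delta(s) \\ \qquad - P^\Delta(s)B^\Delta(s)R^\Delta(s)^{-1}B^\Delta(s)^T P^\Delta(s) = 0, \qquad s \in (t_{k-1}, t_k), \\ P^\Delta(t_k - 0) = \Phi^\Delta(t_N;t_k)^T G(t_{k-1}) \Phi^\Delta(t_N;t_k) \\ \qquad + \displaystyle\int_{t_k}^{t_N} \big( \Phi^\Delta(s;t_k)^T Q(t_{k-1},s) \Phi^\Delta(s;t_k) + \Psi^\Delta(s;t_k)^T R(t_{k-1},s) \Psi^\Delta(s;t_k) \big)\,ds, \\ \qquad\qquad 1 \le k \le N, \end{cases} \tag{2.11}$$

with $\Phi^\Delta(\cdot\,;t_k)$ ($0 \le k \le N-1$) being the solution to the following:

$$\begin{cases} \Phi^\Delta_s(s;t_k) = \big[ A^\Delta(s) - B^\Delta(s)R^\Delta(s)^{-1}B^\Delta(s)^T P^\Delta(s) \big] \Phi^\Delta(s;t_k), & s \in (t_k, T], \\ \Phi^\Delta(t_k;t_k) = I, \end{cases} \tag{2.12}$$

and

$$\Psi^\Delta(s;t_k) = -R^\Delta(s)^{-1} B^\Delta(s)^T P^\Delta(s) \Phi^\Delta(s;t_k), \qquad s \in [t_k, T]. \tag{2.13}$$

The equilibrium state trajectory $\bar X^\Delta(\cdot)$ is the solution to the following closed-loop system:

$$\begin{cases} \dot{\bar X}^\Delta(s) = \big[ A^\Delta(s) - B^\Delta(s)R^\Delta(s)^{-1}B^\Delta(s)^T P^\Delta(s) \big] \bar X^\Delta(s), & s \in [0,T], \\ \bar X^\Delta(0) = x, \end{cases} \tag{2.14}$$

and the equilibrium pair $(\bar X^\Delta(\cdot), \bar u^\Delta(\cdot))$ can be explicitly represented by the following:

$$\bar X^\Delta(s) = \Phi^\Delta(s;0)x, \qquad \bar u^\Delta(s) = \Psi^\Delta(s;0)x, \qquad s \in [0,T]. \tag{2.15}$$

Moreover,

$$0 \le P^\Delta(t) \le P_0^\Delta(t), \qquad t \in [0,T], \tag{2.16}$$

where $P_0^\Delta(\cdot)$ is the unique solution to the following Lyapunov equation:

$$\begin{cases} \dot P_0^\Delta(s) + P_0^\Delta(s)A^\Delta(s) + A^\Delta(s)^T P_0^\Delta(s) + Q^\Delta(s) = 0, \qquad s \in (t_{k-1}, t_k), \\ P_0^\Delta(t_k - 0) = \Phi^\Delta(t_N;t_k)^T G(t_{k-1}) \Phi^\Delta(t_N;t_k) \\ \qquad + \displaystyle\int_{t_k}^{t_N} \big[ \Phi^\Delta(s;t_k)^T Q(t_{k-1},s) \Phi^\Delta(s;t_k) + \Psi^\Delta(s;t_k)^T R(t_{k-1},s) \Psi^\Delta(s;t_k) \big]\,ds, \\ \qquad\qquad 1 \le k \le N. \end{cases} \tag{2.17}$$

We point out that the solution $P^\Delta(\cdot)$ of Riccati equation (2.11) and the solution $P_0^\Delta(\cdot)$ of Lyapunov
equation (2.17) have possible jumps at $t_k$, $k = 1, 2, \cdots, N-1$.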

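Proof. Let $x \in \mathbb{R}^n$ and $\Delta : 0 = t_0 < t_1 < \cdots < t_N = T$ be given. Let $(\bar X^\Delta(\cdot), \bar u^\Delta(\cdot))$ be an
equilibrium pair of Problem (LQ$^\Delta$). Then its restriction to $[t_{N-1}, t_N]$ is the optimal pair
of the LQ problem for Player $N$ on $[t_{N-1}, t_N]$, with the state equation

$$\dot X^\Delta(s) = A^\Delta(s)X^\Delta(s) + B^\Delta(s)u_N(s), \qquad s \in [t_{N-1}, t_N], \tag{2.18}$$

and with the cost functional

$$J_N(\bar u_1(\cdot), \cdots, \bar u_{N-1}(\cdot), u_N(\cdot)) = \langle G_N X^\Delta(t_N), X^\Delta(t_N) \rangle + \int_{t_{N-1}}^{t_N} \big[ \langle Q^\Delta(s)X^\Delta(s), X^\Delta(s) \rangle + \langle R^\Delta(s)u_N(s), u_N(s) \rangle \big]\,ds, \tag{2.19}$$

where $G_N = G(t_{N-1})$. To study this LQ problem, we consider the following state equation:

$$\begin{cases} \dot X_N(s) = A^\Delta(s)X_N(s) + B^\Delta(s)u_N(s), & s \in [t, t_N], \\ X_N(t) = y, \end{cases} \tag{2.20}$$

where $(t,y) \in [t_{N-1}, t_N) \times \mathbb{R}^n$, with the cost functional

$$J_N(t,y;u_N(\cdot)) = \langle G_N X_N(t_N), X_N(t_N) \rangle + \int_t^{t_N} \big[ \langle Q^\Delta(s)X_N(s), X_N(s) \rangle + \langle R^\Delta(s)u_N(s), u_N(s) \rangle \big]\,ds. \tag{2.21}$$

For such an LQ problem on $[t, t_N]$, under (H1), there exists a unique optimal control, which must
have the following form:

$$\bar u_N(s) = -R^\Delta(s)^{-1} B^\Delta(s)^T P^\Delta(s) \bar X_N(s), \qquad s \in [t, t_N], \tag{2.22}$$

where $P^\Delta(\cdot)$ is the unique solution to the following Riccati equation:

$$\begin{cases} \dot P^\Delta(s) + P^\Delta(s)A^\Delta(s) + A^\Delta(s)^T P^\Delta(s) + Q^\Delta(s) \\ \qquad - P^\Delta(s)B^\Delta(s)R^\Delta(s)^{-1}B^\Delta(s)^T P^\Delta(s) = 0, \qquad s \in (t_{N-1}, t_N), \\ P^\Delta(t_N) = G_N, \end{cases} \tag{2.23}$$

and $\bar X_N(\cdot)$ is the solution to the following closed-loop state equation:

$$\begin{cases} \dot{\bar X}_N(s) = \big[ A^\Delta(s) - B^\Delta(s)R^\Delta(s)^{-1}B^\Delta(s)^T P^\Delta(s) \big] \bar X_N(s), & s \in [t, t_N], \\ \bar X_N(t) = y. \end{cases} \tag{2.24}$$

Let $\Phi^\Delta(\cdot\,;t)$ be the solution to the following (note that $t \in [t_{N-1}, t_N]$):

$$\begin{cases} \Phi^\Delta_s(s;t) = \big[ A^\Delta(s) - B^\Delta(s)R^\Delta(s)^{-1}B^\Delta(s)^T P^\Delta(s) \big] \Phi^\Delta(s;t), & s \in (t, t_N], \\ \Phi^\Delta(t;t) = I, \end{cases} \tag{2.25}$$

and denote

$$\Psi^\Delta(s;t) = -R^\Delta(s)^{-1} B^\Delta(s)^T P^\Delta(s) \Phi^\Delta(s;t), \qquad s \in [t, t_N]. \tag{2.26}$$

Then the optimal pair $(\bar X_N(\cdot), \bar u_N(\cdot))$ of the LQ problem (on $[t, t_N]$) admits the following representation:

$$\bar X_N(s) = \Phi^\Delta(s;t)y, \qquad \bar u_N(s) = \Psi^\Delta(s;t)y, \qquad s \in [t, t_N]. \tag{2.27}$$

Further,

$$\begin{aligned} \langle P^\Delta(t)y, y \rangle &= J_N(t,y;\bar u_N(\cdot)) = \langle G_N \bar X_N(t_N), \bar X_N(t_N) \rangle \\ &\quad + \int_t^{t_N} \big[ \langle Q(t_{N-1},s)\bar X_N(s), \bar X_N(s) \rangle + \langle R(t_{N-1},s)\bar u_N(s), \bar u_N(s) \rangle \big]\,ds \\ &= \Big\langle \Big[ \Phi^\Delta(t_N;t)^T G_N \Phi^\Delta(t_N;t) + \int_t^{t_N} \big( \Phi^\Delta(s;t)^T Q(t_{N-1},s) \Phi^\Delta(s;t) \\ &\qquad\qquad + \Psi^\Delta(s;t)^T R(t_{N-1},s) \Psi^\Delta(s;t) \big)\,ds \Big] y, y \Big\rangle. \end{aligned} \tag{2.28}$$

Since $y \in \mathbb{R}^n$ can be arbitrarily chosen, we have

$$\begin{aligned} P^\Delta(t) &= \Phi^\Delta(t_N;t)^T G_N \Phi^\Delta(t_N;t) + \int_t^{t_N} \big[ \Phi^\Delta(s;t)^T Q(t_{N-1},s) \Phi^\Delta(s;t) \\ &\qquad + \Psi^\Delta(s;t)^T R(t_{N-1},s) \Psi^\Delta(s;t) \big]\,ds, \qquad t \in (t_{N-1}, t_N]. \end{aligned} \tag{2.29}$$

Also, by the optimality of $\bar u_N(\cdot)$, we have

$$\begin{aligned} \langle P^\Delta(t)y, y \rangle &= J_N(t,y;\bar u_N(\cdot)) \le J_N(t,y;0) \\ &= \langle G_N X^0(t_N), X^0(t_N) \rangle + \int_t^{t_N} \langle Q(t_{N-1},s)X^0(s), X^0(s) \rangle\,ds = \langle P_0^\Delta(t)y, y \rangle, \end{aligned} \tag{2.30}$$

where $X^0(\cdot)$ is the solution to the following:

$$\begin{cases} \dot X^0(s) = A^\Delta(s)X^0(s), & s \in [t, t_N], \\ X^0(t) = y, \end{cases} \tag{2.31}$$

and $P_0^\Delta(\cdot)$ is the solution to the following Lyapunov equation:

$$\begin{cases} \dot P_0^\Delta(s) + P_0^\Delta(s)A^\Delta(s) + A^\Delta(s)^T P_0^\Delta(s) + Q^\Delta(s) = 0, & s \in (t_{N-1}, t_N), \\ P_0^\Delta(t_N - 0) = G_N, \end{cases} \tag{2.32}$$

which can be represented by the following:

$$P_0^\Delta(t) = \Phi_0^\Delta(t_N;t)^T G_N \Phi_0^\Delta(t_N;t) + \int_t^{t_N} \Phi_0^\Delta(s;t)^T Q(t_{N-1},s) \Phi_0^\Delta(s;t)\,ds, \qquad t \in [t_{N-1}, t_N], \tag{2.33}$$

with $\Phi_0^\Delta(\cdot\,;t)$ being the solution to the following:

$$\begin{cases} \dfrac{d}{ds}\Phi_0^\Delta(s;t) = A^\Delta(s)\Phi_0^\Delta(s;t), & s \in [t, t_N], \\ \Phi_0^\Delta(t;t) = I. \end{cases} \tag{2.34}$$

Note that $\Phi_0^\Delta(\cdot\,;t)$ can be defined for any $t \in [0, t_N)$, which will be used below. Hence,

$$0 \le P^\Delta(t) \le P_0^\Delta(t), \qquad t \in (t_{N-1}, t_N]. \tag{2.35}$$

It is also clear that the restriction of the equilibrium pair $(\bar X^\Delta(\cdot), \bar u^\Delta(\cdot))$ to $(t_{N-1}, t_N]$ admits the
following representation:

$$\bar X^\Delta(s) = \Phi^\Delta(s;t_{N-1})\bar X^\Delta(t_{N-1}), \qquad \bar u^\Delta(s) = \Psi^\Delta(s;t_{N-1})\bar X^\Delta(t_{N-1}), \qquad s \in [t_{N-1}, t_N]. \tag{2.36}$$
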
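Next, for Player $(N-1)$, inspired by the above, we consider the following state equation:

$$\begin{cases} \dot X_{N-1}(s) = A^\Delta(s)X_{N-1}(s) + B^\Delta(s)u_{N-1}(s), & s \in [t, t_{N-1}], \\ X_{N-1}(t) = y, \end{cases} \tag{2.37}$$

where $(t,y) \in [t_{N-2}, t_{N-1}) \times \mathbb{R}^n$. Let us denote

$$\widetilde X_{N-1}(s) = \Phi^\Delta(s;t_{N-1})X_{N-1}(t_{N-1}), \qquad \widetilde u_{N-1}(s) = \Psi^\Delta(s;t_{N-1})X_{N-1}(t_{N-1}), \qquad s \in [t_{N-1}, t_N]. \tag{2.38}$$

Thus, $(\widetilde X_{N-1}(\cdot), \widetilde u_{N-1}(\cdot))$ is the optimal pair for Player $N$ starting from the initial pair
$(t_{N-1}, X_{N-1}(t_{N-1}))$. The cost functional for the LQ problem of Player $(N-1)$ on $[t, t_{N-1}]$ is taken to be

$$\begin{aligned} J_{N-1}(t,y;u_{N-1}(\cdot)) &= \int_t^{t_{N-1}} \big[ \langle Q(t_{N-2},s)X_{N-1}(s), X_{N-1}(s) \rangle + \langle R(t_{N-2},s)u_{N-1}(s), u_{N-1}(s) \rangle \big]\,ds \\ &\quad + \int_{t_{N-1}}^{t_N} \big[ \langle Q(t_{N-2},s)\widetilde X_{N-1}(s), \widetilde X_{N-1}(s) \rangle + \langle R(t_{N-2},s)\widetilde u_{N-1}(s), \widetilde u_{N-1}(s) \rangle \big]\,ds \\ &\quad + \langle G(t_{N-2})\widetilde X_{N-1}(t_N), \widetilde X_{N-1}(t_N) \rangle \\ &= \int_t^{t_{N-1}} \big[ \langle Q^\Delta(s)X_{N-1}(s), X_{N-1}(s) \rangle + \langle R^\Delta(s)u_{N-1}(s), u_{N-1}(s) \rangle \big]\,ds \\ &\quad + \langle G_{N-1}X_{N-1}(t_{N-1}), X_{N-1}(t_{N-1}) \rangle, \end{aligned} \tag{2.39}$$

where

$$\begin{aligned} G_{N-1} &= \Phi^\Delta(t_N;t_{N-1})^T G(t_{N-2}) \Phi^\Delta(t_N;t_{N-1}) \\ &\quad + \int_{t_{N-1}}^{t_N} \big[ \Phi^\Delta(s;t_{N-1})^T Q(t_{N-2},s) \Phi^\Delta(s;t_{N-1}) + \Psi^\Delta(s;t_{N-1})^T R(t_{N-2},s) \Psi^\Delta(s;t_{N-1}) \big]\,ds. \end{aligned} \tag{2.40}$$

For such an LQ problem (on $[t, t_{N-1}]$), under (H1), the optimal control is given by

$$\bar u_{N-1}(s) = -R^\Delta(s)^{-1} B^\Delta(s)^T P^\Delta(s) \bar X_{N-1}(s), \qquad s \in [t, t_{N-1}], \tag{2.41}$$

where $P^\Delta(\cdot)$ is the solution to the following Riccati equation:

$$\begin{cases} \dot P^\Delta(s) + P^\Delta(s)A^\Delta(s) + A^\Delta(s)^T P^\Delta(s) + Q^\Delta(s) \\ \qquad - P^\Delta(s)B^\Delta(s)R^\Delta(s)^{-1}B^\Delta(s)^T P^\Delta(s) = 0, \qquad s \in (t_{N-2}, t_{N-1}), \\ P^\Delta(t_{N-1} - 0) = G_{N-1}, \end{cases} \tag{2.42}$$

and $\bar X_{N-1}(\cdot)$ is the solution to the following closed-loop state equation:

$$\begin{cases} \dot{\bar X}_{N-1}(s) = \big[ A^\Delta(s) - B^\Delta(s)R^\Delta(s)^{-1}B^\Delta(s)^T P^\Delta(s) \big] \bar X_{N-1}(s), & s \in [t, t_{N-1}], \\ \bar X_{N-1}(t) = y. \end{cases} \tag{2.43}$$

Now, similar to (2.25), for $t \in [t_{N-2}, t_{N-1}]$, let $\Phi^\Delta(\cdot\,;t)$ be the solution to the following:

$$\begin{cases} \Phi^\Delta_s(s;t) = \big[ A^\Delta(s) - B^\Delta(s)R^\Delta(s)^{-1}B^\Delta(s)^T P^\Delta(s) \big] \Phi^\Delta(s;t), & s \in (t, t_N], \\ \Phi^\Delta(t;t) = I, \end{cases} \tag{2.44}$$

and denote

$$\Psi^\Delta(s;t) = -R^\Delta(s)^{-1} B^\Delta(s)^T P^\Delta(s) \Phi^\Delta(s;t), \qquad s \in [t, t_N]. \tag{2.45}$$

It is clear that for $t \in [t_{N-2}, t_{N-1}]$,

$$\Phi^\Delta(s;t) = \Phi^\Delta(s;t_{N-1})\Phi^\Delta(t_{N-1};t), \qquad s \in [t_{N-1}, t_N], \tag{2.46}$$

and the optimal pair $(\bar X_{N-1}(\cdot), \bar u_{N-1}(\cdot))$ of the LQ problem associated with (2.37) and (2.39) (on
$[t, t_{N-1}]$) is given by the following:

$$\bar X_{N-1}(s) = \Phi^\Delta(s;t)y, \qquad \bar u_{N-1}(s) = \Psi^\Delta(s;t)y, \qquad s \in [t, t_{N-1}]. \tag{2.47}$$

Hence, the restriction of the equilibrium pair $(\bar X^\Delta(\cdot), \bar u^\Delta(\cdot))$ to $[t_{N-2}, t_N]$ admits the following
representation:

$$\bar X^\Delta(s) = \Phi^\Delta(s;t_{N-2})\bar X^\Delta(t_{N-2}), \qquad \bar u^\Delta(s) = \Psi^\Delta(s;t_{N-2})\bar X^\Delta(t_{N-2}), \qquad s \in [t_{N-2}, t_N]. \tag{2.48}$$

Further,

$$\begin{aligned} \langle P^\Delta(t)y, y \rangle &= J_{N-1}(t,y;\bar u_{N-1}(\cdot)) = \langle G_{N-1}\bar X_{N-1}(t_{N-1}), \bar X_{N-1}(t_{N-1}) \rangle \\ &\quad + \int_t^{t_{N-1}} \big[ \langle Q(t_{N-2},s)\bar X_{N-1}(s), \bar X_{N-1}(s) \rangle + \langle R(t_{N-2},s)\bar u_{N-1}(s), \bar u_{N-1}(s) \rangle \big]\,ds \\ &= \Big\langle \Big[ \Phi^\Delta(t_{N-1};t)^T G_{N-1} \Phi^\Delta(t_{N-1};t) + \int_t^{t_{N-1}} \big( \Phi^\Delta(s;t)^T Q(t_{N-2},s) \Phi^\Delta(s;t) \\ &\qquad\qquad + \Psi^\Delta(s;t)^T R(t_{N-2},s) \Psi^\Delta(s;t) \big)\,ds \Big] y, y \Big\rangle \\ &= \Big\langle \Big[ \Phi^\Delta(t_N;t)^T G(t_{N-2}) \Phi^\Delta(t_N;t) + \int_t^{t_N} \big( \Phi^\Delta(s;t)^T Q(t_{N-2},s) \Phi^\Delta(s;t) \\ &\qquad\qquad + \Psi^\Delta(s;t)^T R(t_{N-2},s) \Psi^\Delta(s;t) \big)\,ds \Big] y, y \Big\rangle. \end{aligned} \tag{2.49}$$

Since $y \in \mathbb{R}^n$ can be arbitrarily chosen, we have

$$\begin{aligned} P^\Delta(t) &= \Phi^\Delta(t_N;t)^T G(t_{N-2}) \Phi^\Delta(t_N;t) + \int_t^{t_N} \big[ \Phi^\Delta(s;t)^T Q(t_{N-2},s) \Phi^\Delta(s;t) \\ &\qquad + \Psi^\Delta(s;t)^T R(t_{N-2},s) \Psi^\Delta(s;t) \big]\,ds, \qquad t \in (t_{N-2}, t_{N-1}). \end{aligned} \tag{2.50}$$

Also, by the optimality of $\bar u_{N-1}(\cdot)$, we have

$$\begin{aligned} \langle P^\Delta(t)y, y \rangle &= J_{N-1}(t,y;\bar u_{N-1}(\cdot)) \le J_{N-1}(t,y;0) \\ &= \langle G_{N-1}X^0(t_{N-1}), X^0(t_{N-1}) \rangle + \int_t^{t_{N-1}} \langle Q(t_{N-2},s)X^0(s), X^0(s) \rangle\,ds = \langle P_0^\Delta(t)y, y \rangle, \end{aligned} \tag{2.51}$$

where $X^0(\cdot)$ is the solution to the following:

$$\begin{cases} \dot X^0(s) = A^\Delta(s)X^0(s), & s \in [t, t_{N-1}], \\ X^0(t) = y, \end{cases} \tag{2.52}$$

and $P_0^\Delta(\cdot)$ is the solution to the following Lyapunov equation:

$$\begin{cases} \dot P_0^\Delta(s) + P_0^\Delta(s)A^\Delta(s) + A^\Delta(s)^T P_0^\Delta(s) + Q^\Delta(s) = 0, & s \in (t_{N-2}, t_{N-1}), \\ P_0^\Delta(t_{N-1} - 0) = G_{N-1}, \end{cases} \tag{2.53}$$

which, similar to the above, admits the following representation:

$$P_0^\Delta(t) = \Phi_0^\Delta(t_{N-1};t)^T G_{N-1} \Phi_0^\Delta(t_{N-1};t) + \int_t^{t_{N-1}} \Phi_0^\Delta(s;t)^T Q^\Delta(s) \Phi_0^\Delta(s;t)\,ds, \qquad t \in [t_{N-2}, t_{N-1}]. \tag{2.54}$$

Hence,

$$0 \le P^\Delta(t) \le P_0^\Delta(t), \qquad t \in (t_{N-2}, t_{N-1}). \tag{2.55}$$

Then one can apply induction to complete the proof. □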
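
The comparison between the Riccati solution and the Lyapunov solution on a single subinterval, as in (2.35), can be illustrated numerically in the scalar case. The sketch below is only an illustration under stated assumptions: constant scalar coefficients `a`, `b`, `q`, `r` and terminal value `g` (hypothetical names standing in for $A^\Delta$, $B^\Delta$, $Q^\Delta$, $R^\Delta$, $G_N$), and a hand-written classical Runge-Kutta integrator run backward in time.

```python
def rk4_backward(f, p_T, T, n):
    """Integrate dp/ds = f(s, p) backward from p(T) = p_T down to s = 0
    with n classical RK4 steps; returns values on the grid 0, T/n, ..., T."""
    h = T / n
    s, p = T, p_T
    values = [p]
    for _ in range(n):
        k1 = f(s, p)
        k2 = f(s - h / 2, p - h / 2 * k1)
        k3 = f(s - h / 2, p - h / 2 * k2)
        k4 = f(s - h, p - h * k3)
        p = p - h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        s -= h
        values.append(p)
    values.reverse()
    return values


def riccati_and_lyapunov(a, b, q, r, g, T, n):
    """Scalar versions of the Riccati and Lyapunov equations:
         p'  + 2 a p  + q - (b^2 / r) p^2 = 0,  p(T)  = g,
         p0' + 2 a p0 + q                 = 0,  p0(T) = g.
    Returns (p, p0) sampled on a uniform grid of [0, T]."""
    riccati = lambda s, p: -(2 * a * p + q - (b * b / r) * p * p)
    lyapunov = lambda s, p: -(2 * a * p + q)
    return rk4_backward(riccati, g, T, n), rk4_backward(lyapunov, g, T, n)


if __name__ == "__main__":
    p, p0 = riccati_and_lyapunov(a=0.5, b=1.0, q=1.0, r=1.0, g=2.0, T=1.0, n=200)
    # The control-free cost p0 dominates the optimal cost p on the whole interval.
    print(all(0.0 <= pi <= p0i + 1e-12 for pi, p0i in zip(p, p0)))
```

The Lyapunov solution corresponds to the cost of the zero control, which is why it bounds the Riccati solution from above; with `a = 0`, `q = 0`, `b = r = 1` the Riccati equation reduces to the one solved in closed form in the Appendix.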

3 Time-Consistent Solutions

We now pose the following problem.

Problem (LQ). For any given $x \in \mathbb{R}^n$, find a control $\bar u(\cdot) \in \mathcal{U}[0,T]$ satisfying the following:
For any $\varepsilon > 0$, there exists a $\delta > 0$ such that for any partition $\Delta$ of $[0,T]$ with $\|\Delta\| < \delta$, one has

$$J_k(\bar u(\cdot)) \le J_k(\bar u^\Delta(\cdot)) + \varepsilon. \tag{3.1}$$

Any control $\bar u(\cdot) \in \mathcal{U}[0,T]$ satisfying the above is called an equilibrium control of Problem (LQ).
The corresponding state trajectory $\bar X(\cdot)$ and the pair $(\bar X(\cdot), \bar u(\cdot))$ are called an equilibrium state
trajectory and an equilibrium pair of Problem (LQ), respectively.

The following gives a weaker notion of time-consistent solutions to Problem (LQ).

Definition 3.1. A control $\bar u(\cdot) \in \mathcal{U}[0,T]$ is called a weak equilibrium control of Problem (LQ)
if there exists a sequence of partitions $\Delta_m$ of $[0,T]$ with $\|\Delta_m\| \to 0$ so that for any $\varepsilon > 0$, there
exists an $m_0 > 0$ such that

$$J_k(\bar u(\cdot)) \le J_k(\bar u^{\Delta_m}(\cdot)) + \varepsilon, \qquad \forall m \ge m_0. \tag{3.2}$$

Our next goal is to find the limit as the mesh size $\|\Delta\|$ of $\Delta$ approaches zero. For this, we
need (H2).

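Theorem 3.1. Let (H1)-(H2) hold. Then for any partition $\Delta$ of $[0,T]$,

$$0 \le P^\Delta(t) \le P_0^\Delta(t) \le \widehat P_0^\Delta(t), \qquad t \in [0,T], \tag{3.3}$$

where $\widehat P_0^\Delta(\cdot)$ is the unique solution to the following Lyapunov equation:

$$\begin{cases} \dot{\widehat P}{}_0^\Delta(s) + \widehat P_0^\Delta(s)A^\Delta(s) + A^\Delta(s)^T \widehat P_0^\Delta(s) + Q^\Delta(s) = 0, & s \in (0, t_N), \\ \widehat P_0^\Delta(t_N) = G(t_{N-1}). \end{cases} \tag{3.4}$$

Consequently, $P^\Delta(\cdot)$ is bounded uniformly in $\Delta$.

Proof. Recall that for $k = 1, 2, \cdots, N-1$,

$$\begin{aligned} P_0^\Delta(t_k - 0) &= \Phi^\Delta(t_N;t_k)^T G(t_{k-1}) \Phi^\Delta(t_N;t_k) \\ &\quad + \int_{t_k}^{t_N} \big[ \Phi^\Delta(s;t_k)^T Q(t_{k-1},s) \Phi^\Delta(s;t_k) + \Psi^\Delta(s;t_k)^T R(t_{k-1},s) \Psi^\Delta(s;t_k) \big]\,ds, \end{aligned} \tag{3.5}$$

and (making use of the monotonicity of $t \mapsto G(t)$, $t \mapsto Q(t,s)$, and $t \mapsto R(t,s)$)

$$\begin{aligned} P_0^\Delta(t_k + 0) \ge P^\Delta(t_k + 0) &= \Phi^\Delta(t_N;t_k)^T G(t_k) \Phi^\Delta(t_N;t_k) \\ &\quad + \int_{t_k}^{t_N} \big[ \Phi^\Delta(s;t_k)^T Q(t_k,s) \Phi^\Delta(s;t_k) + \Psi^\Delta(s;t_k)^T R(t_k,s) \Psi^\Delta(s;t_k) \big]\,ds \\ &\ge \Phi^\Delta(t_N;t_k)^T G(t_{k-1}) \Phi^\Delta(t_N;t_k) + \int_{t_k}^{t_N} \big[ \Phi^\Delta(s;t_k)^T Q(t_{k-1},s) \Phi^\Delta(s;t_k) \\ &\qquad + \Psi^\Delta(s;t_k)^T R(t_{k-1},s) \Psi^\Delta(s;t_k) \big]\,ds = P^\Delta(t_k - 0) = P_0^\Delta(t_k - 0). \end{aligned} \tag{3.6}$$

Note that

$$\widehat P_0^\Delta(t) = P_0^\Delta(t), \qquad t \in (t_{N-1}, t_N], \tag{3.7}$$

and due to (making use of (3.6) for $k = N-1$)

$$P_0^\Delta(t_{N-1} - 0) \le P^\Delta(t_{N-1} + 0) \le P_0^\Delta(t_{N-1} + 0) = \widehat P_0^\Delta(t_{N-1}) = \widehat P_0^\Delta(t_{N-1} - 0), \tag{3.8}$$

we have

$$P_0^\Delta(t) \le \widehat P_0^\Delta(t), \qquad t \in (t_{N-2}, t_{N-1}). \tag{3.9}$$

Then by induction, we can obtain

$$P_0^\Delta(t) \le \widehat P_0^\Delta(t), \qquad t \in [0, t_N]. \tag{3.10}$$

By the boundedness of $A(\cdot\,,\cdot)$ and $Q(\cdot\,,\cdot)$, we have the boundedness of $\widehat P_0^\Delta(\cdot)$ uniformly in $\Delta$. Hence,
we complete the proof. □

We see that $P^\Delta(\cdot)$ has a possible jump at each $t_k$, with the jump size:

$$\begin{aligned} \Delta P^\Delta(t_k) &\equiv P^\Delta(t_k + 0) - P^\Delta(t_k - 0) \\ &= \Phi^\Delta(t_N;t_k)^T [G(t_k) - G(t_{k-1})] \Phi^\Delta(t_N;t_k) \\ &\quad + \int_{t_k}^{t_N} \big[ \Phi^\Delta(s;t_k)^T (Q(t_k,s) - Q(t_{k-1},s)) \Phi^\Delta(s;t_k) \\ &\qquad + \Psi^\Delta(s;t_k)^T (R(t_k,s) - R(t_{k-1},s)) \Psi^\Delta(s;t_k) \big]\,ds \ge 0. \end{aligned} \tag{3.11}$$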

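By (H1)-(H2), we have

$$\|\Delta P^\Delta(t_k)\| \le K(t_k - t_{k-1}) \le K\|\Delta\|. \tag{3.12}$$

Next, we define $\widetilde P^\Delta(\cdot)$ as follows:

$$\begin{cases} \widetilde P^\Delta(t) = P^\Delta(t), & t \in (t_{N-1}, t_N], \\ \widetilde P^\Delta(t) = P^\Delta(t) + \dfrac{t - t_{k-1}}{t_k - t_{k-1}}\,\Delta P^\Delta(t_k), & t \in (t_{k-1}, t_k), \quad 1 \le k \le N-1, \\ \widetilde P^\Delta(t_k) = P^\Delta(t_k + 0), & 1 \le k \le N-1. \end{cases} \tag{3.13}$$

Then $\{\widetilde P^\Delta(\cdot)\}$ is uniformly bounded and equicontinuous. Hence, we may assume that along a
certain sequence $\Delta_m$ with $\|\Delta_m\| \to 0$,

$$\lim_{m \to \infty} \widetilde P^{\Delta_m}(\cdot) = P(\cdot), \tag{3.14}$$

for some $P(\cdot)$. Also, we have

$$\|\widetilde P^\Delta(t) - P^\Delta(t)\| \le \max_{1 \le k \le N-1} \|\Delta P^\Delta(t_k)\| \le K\|\Delta\| \to 0, \qquad \text{as } \|\Delta\| \to 0. \tag{3.15}$$

Hence, we have

$$\lim_{m \to \infty} \|P^{\Delta_m}(\cdot) - P(\cdot)\| = 0. \tag{3.16}$$

Next, it is clear that

$$\lim_{\|\Delta\| \to 0} \big[ \|A^\Delta(s) - A(s,s)\| + \|B^\Delta(s) - B(s,s)\| + \|Q^\Delta(s) - Q(s,s)\| + \|R^\Delta(s) - R(s,s)\| \big] = 0. \tag{3.17}$$

Hence,

$$\lim_{\|\Delta\| \to 0} \|\Phi^\Delta(s;t) - \Phi(s;t)\| = 0, \tag{3.18}$$

with $\Phi(\cdot\,;t)$ being the solution to the following:

$$\begin{cases} \Phi_s(s;t) = \big[ A(s,s) - B(s,s)R(s,s)^{-1}B(s,s)^T P(s) \big] \Phi(s;t), & s \in (t, t_N], \\ \Phi(t;t) = I. \end{cases} \tag{3.19}$$

Consequently, $P(\cdot)$ satisfies the following:

$$\begin{aligned} P(t) &= \Phi(T;t)^T G(t) \Phi(T;t) + \int_t^T \big[ \Phi(s;t)^T Q(t,s) \Phi(s;t) \\ &\quad + \Phi(s;t)^T P(s)B(s,s)R(s,s)^{-1}R(t,s)R(s,s)^{-1}B(s,s)^T P(s)\Phi(s;t) \big]\,ds, \qquad t \in (0,T). \end{aligned} \tag{3.20}$$

Denote

$$A(s) = A(s,s), \qquad B(s) = B(s,s), \qquad R(s) = R(s,s).$$

Then we have the following system of forward-backward Volterra integral equations:

$$\begin{cases} \Phi(s;t) = I + \displaystyle\int_t^s \big[ A(r) - B(r)R(r)^{-1}B(r)^T P(r) \big] \Phi(r;t)\,dr, \qquad s \in [t,T], \\ P(t) = \Phi(T;t)^T G(t) \Phi(T;t) + \displaystyle\int_t^T \big[ \Phi(r;t)^T Q(t,r) \Phi(r;t) \\ \qquad + \Phi(r;t)^T P(r)B(r)R(r)^{-1}R(t,r)R(r)^{-1}B(r)^T P(r)\Phi(r;t) \big]\,dr, \qquad t \in [0,T]. \end{cases} \tag{3.21}$$

Suppose the above admits a unique solution $(\Phi(\cdot\,;\cdot), P(\cdot))$. Then

$$\lim_{\|\Delta\| \to 0} \|P^\Delta(\cdot) - P(\cdot)\| = 0, \tag{3.22}$$

and

$$\lim_{\|\Delta\| \to 0} \|\Phi^\Delta(\cdot\,;\cdot) - \Phi(\cdot\,;\cdot)\| = 0. \tag{3.23}$$

Then

$$\lim_{\|\Delta\| \to 0} \big[ \|\bar X^\Delta(\cdot) - \bar X(\cdot)\| + \|\bar u^\Delta(\cdot) - \bar u(\cdot)\| \big] = 0, \tag{3.24}$$

with

$$\begin{cases} \bar X(s) = \Phi(s;0)x, \\ \bar u(s) = -R(s)^{-1}B(s)^T P(s)\bar X(s) = -R(s)^{-1}B(s)^T P(s)\Phi(s;0)x, \end{cases} \qquad s \in [0,T]. \tag{3.25}$$

Hence, for any $\varepsilon > 0$, there exists a $\delta > 0$ such that for any partition $\Delta$ of $[0,T]$, as long as $\|\Delta\| < \delta$,

Appendix.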

Let us now solve Problem (C) explicitly. For any given $(t,x) \in [0,T) \times \mathbb{R}$, according to
standard LQ theory, in the current case, the corresponding Riccati differential equation reads

$$\begin{cases} \dot P(s) - P(s)^2 = 0, & s \in [t,T], \\ P(T) = h(t). \end{cases} \tag{A.1}$$

Clearly, the solution of the above Riccati equation depends on $t$. Hence, we denote it by $P(\cdot\,;t)$.
Simple calculation shows that

$$P(s;t) = \frac{h(t)}{1 + h(t)(T-s)}, \qquad s \in [t,T]. \tag{A.2}$$

The optimal trajectory is the solution to the following closed-loop system

$$\begin{cases} \dot{\bar X}(s) = -P(s;t)\bar X(s), & s \in [t,T], \\ \bar X(t) = x, \end{cases} \tag{A.3}$$

which is given by

$$\bar X(s;t,x) = x\,\frac{1 + h(t)(T-s)}{1 + h(t)(T-t)}, \qquad s \in [t,T], \tag{A.4}$$

and the optimal control is given by

$$\bar u(s;t,x) = -P(s;t)\bar X(s;t,x) = -\frac{x\,h(t)}{1 + h(t)(T-t)}, \qquad s \in [t,T]. \tag{A.5}$$

Now, if we let

$$J(t;\tau,y;u(\cdot)) = \int_\tau^T u(s)^2\,ds + h(t)X(T;\tau,y,u(\cdot))^2, \qquad \tau \in [t,T], \tag{A.6}$$

then the optimal value function (for fixed $t$) is given by

$$V(t;\tau,y) = \inf_{u(\cdot) \in \mathcal{U}[\tau,T]} J(t;\tau,y;u(\cdot)) = J(t;\tau,y;\bar u(\cdot)) = P(\tau;t)y^2 = \frac{h(t)\,y^2}{1 + h(t)(T-\tau)}, \qquad \forall (\tau,y) \in [t,T] \times \mathbb{R}. \tag{A.7}$$

Next, let $\tau \in (t,T)$; we consider Problem (C) on $[\tau,T]$ with initial state

$$y = \bar X(\tau;t,x) = x\,\frac{1 + h(t)(T-\tau)}{1 + h(t)(T-t)}. \tag{A.8}$$

The same as above, we see that the corresponding solution to the Riccati equation is given by

$$P(s;\tau) = \frac{h(\tau)}{1 + h(\tau)(T-s)}, \qquad s \in [\tau,T], \tag{A.9}$$

and

$$\inf_{u(\cdot) \in \mathcal{U}[\tau,T]} J(\tau,y;u(\cdot)) = P(\tau;\tau)y^2 = \frac{h(\tau)\,y^2}{1 + h(\tau)(T-\tau)} = \frac{x^2 h(\tau)[1 + h(t)(T-\tau)]^2}{[1 + h(\tau)(T-\tau)][1 + h(t)(T-t)]^2}. \tag{A.10}$$

However,

$$\begin{aligned} J\big(\tau,y;\bar u|_{[\tau,T]}(\cdot)\big) &= \int_\tau^T \bar u(s)^2\,ds + h(\tau)\bar X(T;t,x)^2 \\ &= \frac{x^2 h(t)^2 (T-\tau)}{[1 + h(t)(T-t)]^2} + h(\tau)\Big[ x - \frac{x\,h(t)(T-t)}{1 + h(t)(T-t)} \Big]^2 \\ &= \frac{x^2 h(t)^2 (T-\tau)}{[1 + h(t)(T-t)]^2} + \frac{x^2 h(\tau)}{[1 + h(t)(T-t)]^2}. \end{aligned} \tag{A.11}$$

Hence,

$$\begin{aligned} &J\big(\tau,y;\bar u|_{[\tau,T]}(\cdot)\big) - \inf_{u(\cdot) \in \mathcal{U}[\tau,T]} J(\tau,y;u(\cdot)) \\ &\quad = \frac{x^2 h(t)^2 (T-\tau) + x^2 h(\tau)}{[1 + h(t)(T-t)]^2} - \frac{x^2 h(\tau)[1 + h(t)(T-\tau)]^2}{[1 + h(\tau)(T-\tau)][1 + h(t)(T-t)]^2} \\ &\quad = x^2\,\frac{[h(t)^2 (T-\tau) + h(\tau)][1 + h(\tau)(T-\tau)] - h(\tau)[1 + h(t)(T-\tau)]^2}{[1 + h(\tau)(T-\tau)][1 + h(t)(T-t)]^2} \\ &\quad = \frac{x^2 [h(\tau) - h(t)]^2 (T-\tau)}{[1 + h(\tau)(T-\tau)][1 + h(t)(T-t)]^2} > 0, \end{aligned} \tag{A.12}$$

unless $h(\tau) = h(t)$ or $x = 0$.

This shows that the restriction of $\bar u(\cdot\,;t,x)$ to $[\tau,T]$ is not necessarily optimal for Problem (C) with
initial pair $(\tau, \bar X(\tau;t,x))$. Hence, Problem (C) is time-inconsistent.

Consider



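$$\begin{cases} \dot X(s) = u(s), & s \in [t,T], \\ X(t) = x, \end{cases} \tag{3.27}$$

with cost functional

$$J(t,x;u(\cdot)) = \int_t^T u(s)^2\,ds + h(t)X(T;t,x,u(\cdot))^2. \tag{3.28}$$

Let $\Delta : 0 = t_0 < t_1 < t_2 < \cdots < t_{N-1} < t_N = T$ be a partition of $[0,T]$. Consider an LQ problem
on $[t_{N-1}, t_N]$, with the state equation

$$\begin{cases} \dot X_N(s) = u_N(s), & s \in [t_{N-1}, t_N], \\ X_N(t_{N-1}) = x, \end{cases} \tag{3.29}$$

and cost functional

$$J_N(t_{N-1},x;u_N(\cdot)) = \int_{t_{N-1}}^{t_N} u_N(s)^2\,ds + h(t_{N-1})X_N(t_N)^2. \tag{3.30}$$

The corresponding Riccati differential equation reads

$$\begin{cases} \dot P^N(s) - P^N(s)^2 = 0, & s \in [t_{N-1}, t_N], \\ P^N(t_N) = h(t_{N-1}). \end{cases}$$

Simple calculation shows that

$$P^N(s) = \frac{P^N(t_N)}{1 + P^N(t_N)(t_N - s)} = \frac{h(t_{N-1})}{1 + h(t_{N-1})(t_N - s)}, \qquad s \in [t_{N-1}, t_N].$$

The optimal trajectory is the solution to the following closed-loop system

$$\begin{cases} \dot{\bar X}_N(s) = -P^N(s)\bar X_N(s), & s \in [t_{N-1}, t_N], \\ \bar X_N(t_{N-1}) = x, \end{cases}$$

which is given by

$$\bar X_N(s) = x\,\frac{1 + P^N(t_N)(t_N - s)}{1 + P^N(t_N)(t_N - t_{N-1})} = x\,\frac{1 + h(t_{N-1})(t_N - s)}{1 + h(t_{N-1})(t_N - t_{N-1})}, \qquad s \in [t_{N-1}, t_N],$$

and the optimal control is given by

$$\bar u_N(s) = -P^N(s)\bar X_N(s) = -\frac{x\,P^N(t_N)}{1 + P^N(t_N)(t_N - t_{N-1})} = -\frac{x\,h(t_{N-1})}{1 + h(t_{N-1})(t_N - t_{N-1})}, \qquad s \in [t_{N-1}, t_N].$$

Now, on $[t_{N-2}, t_{N-1}]$, we consider state equation

$$\begin{cases} \dot X_{N-1}(s) = u_{N-1}(s), & s \in [t_{N-2}, t_{N-1}], \\ X_{N-1}(t_{N-2}) = x, \end{cases} \tag{3.31}$$

with the cost functional

$$\begin{aligned} J_{N-1}(t_{N-2},x;u_{N-1}(\cdot)) &= \int_{t_{N-2}}^{t_{N-1}} u_{N-1}(s)^2\,ds + \int_{t_{N-1}}^{t_N} \bar u_N(s)^2\,ds + h(t_{N-2})X(t_N)^2 \\ &= \int_{t_{N-2}}^{t_{N-1}} u_{N-1}(s)^2\,ds + \frac{h(t_{N-1})^2 (t_N - t_{N-1}) + h(t_{N-2})}{[1 + h(t_{N-1})(t_N - t_{N-1})]^2}\,X_{N-1}(t_{N-1})^2. \end{aligned}$$

For the corresponding LQ problem, the Riccati equation is

$$\begin{cases} \dot P^{N-1}(s) - P^{N-1}(s)^2 = 0, & s \in [t_{N-2}, t_{N-1}], \\ P^{N-1}(t_{N-1}) = \dfrac{P^N(t_N)^2 (t_N - t_{N-1}) + h(t_{N-2})}{[1 + P^N(t_N)(t_N - t_{N-1})]^2}. \end{cases}$$

Simple calculation shows that

$$\begin{aligned} P^{N-1}(s) &= \frac{P^{N-1}(t_{N-1})}{1 + P^{N-1}(t_{N-1})(t_{N-1} - s)} \\ &= \frac{P^N(t_N)^2 (t_N - t_{N-1}) + h(t_{N-2})}{[1 + P^N(t_N)(t_N - t_{N-1})]^2 + [P^N(t_N)^2 (t_N - t_{N-1}) + h(t_{N-2})](t_{N-1} - s)} \\ &= \frac{h(t_{N-1})^2 (t_N - t_{N-1}) + h(t_{N-2})}{[1 + h(t_{N-1})(t_N - t_{N-1})]^2 + [h(t_{N-1})^2 (t_N - t_{N-1}) + h(t_{N-2})](t_{N-1} - s)}, \qquad s \in [t_{N-2}, t_{N-1}]. \end{aligned}$$

Note

$$\begin{aligned} P^N(t_{N-1}) - P^{N-1}(t_{N-1}) &= \frac{P^N(t_N)}{1 + P^N(t_N)(t_N - t_{N-1})} - \frac{P^N(t_N)^2 (t_N - t_{N-1}) + h(t_{N-2})}{[1 + P^N(t_N)(t_N - t_{N-1})]^2} \\ &= \frac{P^N(t_N)[1 + P^N(t_N)(t_N - t_{N-1})] - P^N(t_N)^2 (t_N - t_{N-1}) - h(t_{N-2})}{[1 + P^N(t_N)(t_N - t_{N-1})]^2} \\ &= \frac{P^N(t_N) - h(t_{N-2})}{[1 + P^N(t_N)(t_N - t_{N-1})]^2} = \frac{h(t_{N-1}) - h(t_{N-2})}{[1 + h(t_{N-1})(t_N - t_{N-1})]^2}. \end{aligned}$$

The optimal trajectory is the solution to the following closed-loop system

$$\begin{cases} \dot{\bar X}_{N-1}(s) = -P^{N-1}(s)\bar X_{N-1}(s), & s \in [t_{N-2}, t_{N-1}], \\ \bar X_{N-1}(t_{N-2}) = x, \end{cases}$$

which is given by

$$\bar X_{N-1}(s) = x\,\frac{1 + P^{N-1}(t_{N-1})(t_{N-1} - s)}{1 + P^{N-1}(t_{N-1})(t_{N-1} - t_{N-2})}, \qquad s \in [t_{N-2}, t_{N-1}],$$

and the optimal control is given by

$$\bar u_{N-1}(s) = -P^{N-1}(s)\bar X_{N-1}(s) = -\frac{x\,P^{N-1}(t_{N-1})}{1 + P^{N-1}(t_{N-1})(t_{N-1} - t_{N-2})}, \qquad s \in [t_{N-2}, t_{N-1}].$$
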
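
The two steps above follow a pattern that can be iterated over all players. The following is a minimal sketch of that backward recursion for the scalar example, under the assumption that the later players' equilibrium behavior is summarized by two scalars: `phi`, with $X(T) = \texttt{phi} \cdot X(t_k)$, and `c`, with $\int_{t_k}^T u^2\,ds = \texttt{c} \cdot X(t_k)^2$. These names and the function `equilibrium_gains` are illustrative and do not appear in the original text.

```python
def equilibrium_gains(h, grid):
    """Backward recursion over players N, N-1, ..., 1 for dX/ds = u, where
    player k minimizes int u^2 ds + h(t_{k-1}) X(T)^2 from time t_{k-1} on.

    grid is the partition [t_0, ..., t_N]. Returns a list G with
    G[k-1] = P^k(t_k), the terminal weight faced by player k at time t_k."""
    N = len(grid) - 1
    phi = 1.0  # X(T) = phi * X(t_k) under the later players' equilibrium play
    c = 0.0    # int_{t_k}^T u^2 ds = c * X(t_k)^2 under the same play
    G = [0.0] * N
    for k in range(N, 0, -1):
        dt = grid[k] - grid[k - 1]
        # Player k weights the terminal state with h(t_{k-1}).
        G[k - 1] = c + h(grid[k - 1]) * phi ** 2
        # Player k's constant control gives X(t_k) = shrink * X(t_{k-1}).
        shrink = 1.0 / (1.0 + G[k - 1] * dt)
        # Control energy on [t_{k-1}, t_k] plus the energy spent afterwards.
        c = (G[k - 1] ** 2 * dt + c) * shrink ** 2
        phi *= shrink
    return G


if __name__ == "__main__":
    grid = [k / 4 for k in range(5)]  # uniform partition of [0, 1], N = 4
    G = equilibrium_gains(lambda s: 1.0 + s, grid)
    # Value coefficient of player k at its own starting time t_{k-1}:
    starts = [g / (1.0 + g * (grid[k + 1] - grid[k])) for k, g in enumerate(G)]
    print(G)
    print(starts)
```

For $k = N$ and $k = N-1$ the recursion reproduces $P^N(t_N) = h(t_{N-1})$ and the terminal value $P^{N-1}(t_{N-1})$ computed above; for constant $h$ it returns the classical values $h / (1 + h(T - t_k))$.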
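Now, on $[t_{N-3}, t_{N-2}]$, we consider state equation

$$\begin{cases} \dot X_{N-2}(s) = u_{N-2}(s), & s \in [t_{N-3}, t_{N-2}], \\ X_{N-2}(t_{N-3}) = x, \end{cases}$$

with the cost functional

$$\begin{aligned} &J_{N-2}(t_{N-3},x;u_{N-2}(\cdot)) \\ &\quad = \int_{t_{N-3}}^{t_{N-2}} u_{N-2}(s)^2\,ds + \int_{t_{N-2}}^{t_{N-1}} \bar u_{N-1}(s)^2\,ds + \int_{t_{N-1}}^{t_N} \bar u_N(s)^2\,ds + h(t_{N-3})X(t_N)^2 \\ &\quad = \int_{t_{N-3}}^{t_{N-2}} u_{N-2}(s)^2\,ds + \frac{P^{N-1}(t_{N-1})^2 (t_{N-1} - t_{N-2})}{[1 + P^{N-1}(t_{N-1})(t_{N-1} - t_{N-2})]^2}\,X_{N-2}(t_{N-2})^2 \\ &\qquad + \frac{P^N(t_N)^2 (t_N - t_{N-1}) + h(t_{N-3})}{[1 + P^N(t_N)(t_N - t_{N-1})]^2}\,\bar X_{N-1}(t_{N-1})^2 \\ &\quad = \int_{t_{N-3}}^{t_{N-2}} u_{N-2}(s)^2\,ds + \Big\{ \frac{P^{N-1}(t_{N-1})^2 (t_{N-1} - t_{N-2})}{[1 + P^{N-1}(t_{N-1})(t_{N-1} - t_{N-2})]^2} \\ &\qquad + \frac{P^N(t_N)^2 (t_N - t_{N-1}) + h(t_{N-3})}{[1 + P^N(t_N)(t_N - t_{N-1})]^2 [1 + P^{N-1}(t_{N-1})(t_{N-1} - t_{N-2})]^2} \Big\}\,X_{N-2}(t_{N-2})^2. \end{aligned}$$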
